58 research outputs found

    Learning Invariant Representations for Deep Latent Variable Models

    Deep latent variable models introduce a new class of generative models which are able to handle unstructured data and encode non-linear dependencies. Despite their known flexibility, these models are frequently not invariant against target-specific transformations. Therefore, they suffer from model mismatches and are challenging to interpret or control. We employ the concept of symmetry transformations from physics to formally describe these invariances. In this thesis, we investigate how we can model invariances when a symmetry transformation is either known or unknown. As a consequence, we make contributions in the domain of variable compression under side information and generative modelling. In our first contribution, we investigate the problem where a symmetry transformation is known yet not implicitly learned by the model. Specifically, we consider the task of estimating mutual information in the context of the deep information bottleneck, which is not invariant against monotone transformations. To address this limitation, we extend the deep information bottleneck with a copula construction. In our second contribution, we address the problem of learning target-invariant subspaces for generative models. In this case, the symmetry transformation is unknown and has to be learned from data. We achieve this by formulating a deep information bottleneck with a target and a target-invariant subspace. To ensure invariance, we provide a continuous mutual information regulariser based on adversarial training. In our last contribution, we introduce an improved method for learning unknown symmetry transformations with cycle-consistency. To do so, we employ the same deep information bottleneck formulation with a partitioned latent space, but enforce target-invariance with a cycle-consistency loss in the latent space. As a result, we overcome potential convergence issues introduced by adversarial training and are able to deal with mixed data. In summary, each of the presented models represents an attempt to better control and understand deep latent variable models by learning symmetry transformations. We demonstrate the effectiveness of our contributions in extensive experiments on both artificial and real-world data.
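    A minimal sketch of the cycle-consistency idea from the last contribution: the latent space is split into a target part and a target-invariant part, and swapping target codes, decoding, and re-encoding should leave the invariant part unchanged. This is an illustrative PyTorch reconstruction under assumed architectures; the module names (Encoder, Decoder) and all dimensions are hypothetical, not the thesis's code.

```python
# Hypothetical sketch: cycle-consistency on a partitioned latent space.
# Names (Encoder, Decoder, z_t, z_i) are illustrative, not the thesis's code.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps an input x to a target part z_t and a target-invariant part z_i."""
    def __init__(self, x_dim, zt_dim, zi_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU())
        self.to_zt = nn.Linear(128, zt_dim)
        self.to_zi = nn.Linear(128, zi_dim)

    def forward(self, x):
        h = self.net(x)
        return self.to_zt(h), self.to_zi(h)

class Decoder(nn.Module):
    """Reconstructs x from the concatenated latent partition."""
    def __init__(self, x_dim, zt_dim, zi_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(zt_dim + zi_dim, 128), nn.ReLU(),
                                 nn.Linear(128, x_dim))

    def forward(self, z_t, z_i):
        return self.net(torch.cat([z_t, z_i], dim=-1))

def cycle_consistency_loss(enc, dec, x):
    """Swap the target part across a random pairing, decode, re-encode,
    and require the invariant part to be unchanged (L2 in latent space)."""
    z_t, z_i = enc(x)
    perm = torch.randperm(x.size(0))
    x_swapped = dec(z_t[perm], z_i)          # decode with a foreign target code
    _, z_i_cycled = enc(x_swapped)           # re-encode the generated sample
    return ((z_i_cycled - z_i) ** 2).mean()  # invariant code should survive the cycle
```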

    Informed MCMC with Bayesian Neural Networks for Facial Image Analysis

    Computer vision tasks are difficult because of the large variability in the data that is induced by changes in light, background and partial occlusion, as well as by the varying pose, texture, and shape of objects. Generative approaches to computer vision allow us to overcome this difficulty by explicitly modeling the physical image formation process. Using generative object models, the analysis of an observed image is performed via Bayesian inference of the posterior distribution. This conceptually simple approach tends to fail in practice because of several difficulties stemming from sampling the posterior distribution: high dimensionality and multi-modality of the posterior as well as expensive simulation of the rendering process. The main difficulty of sampling approaches in a computer vision context is choosing the proposal distribution accurately so that maxima of the posterior are explored early and the algorithm quickly converges to a valid image interpretation. In this work, we propose to use a Bayesian Neural Network for estimating an image-dependent proposal distribution. Compared to a standard Gaussian random walk proposal, this accelerates the sampler in finding regions of high posterior value. In this way, we can significantly reduce the number of samples needed to perform facial image analysis. Comment: Accepted to the Bayesian Deep Learning Workshop at NeurIPS 2018.
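    To make the role of the informed proposal concrete, here is a generic Metropolis-Hastings sketch contrasting a Gaussian random walk with an image-dependent independence proposal whose mean and scale would come from a network. The functions log_posterior and proposal_net are assumptions for illustration, not the paper's interface.

```python
# Hypothetical sketch of Metropolis-Hastings with an informed, image-dependent
# proposal; `log_posterior` and `proposal_net` are assumed, not the paper's API.
import numpy as np

def metropolis_hastings(log_posterior, theta0, propose, log_q, n_steps=1000, rng=None):
    """Generic MH loop: `propose(theta, rng)` draws a candidate,
    `log_q(new, old)` is the log proposal density (needed when it is asymmetric)."""
    if rng is None:
        rng = np.random.default_rng(0)
    theta, samples = theta0, []
    for _ in range(n_steps):
        cand = propose(theta, rng)
        log_alpha = (log_posterior(cand) - log_posterior(theta)
                     + log_q(theta, cand) - log_q(cand, theta))
        if np.log(rng.uniform()) < log_alpha:
            theta = cand
        samples.append(theta)
    return np.array(samples)

# Baseline: symmetric Gaussian random walk around the current state.
def random_walk(theta, rng, scale=0.05):
    return theta + scale * rng.standard_normal(theta.shape)

# Informed proposal: a network predicts a per-image mean and scale for the
# scene parameters, so candidates land near high-posterior regions sooner.
def informed_proposal_factory(proposal_net, image):
    mu, sigma = proposal_net(image)          # hypothetical: returns arrays
    def propose(theta, rng):
        return mu + sigma * rng.standard_normal(theta.shape)
    def log_q(cand, _theta):                 # independence proposal density
        return -0.5 * np.sum(((cand - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2))
    return propose, log_q
```

    For the symmetric random walk the proposal terms cancel, so one can simply pass log_q=lambda a, b: 0.0 to the same loop.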

    Learning Channel Importance for High Content Imaging with Interpretable Deep Input Channel Mixing

    Uncovering novel drug candidates for treating complex diseases remains one of the most challenging tasks in early discovery research. To tackle this challenge, biopharma research has established a standardized high content imaging protocol that tags different cellular compartments per image channel. In order to judge the experimental outcome, the scientist requires knowledge of each channel's importance with respect to a certain phenotype in order to decode the underlying biology. In contrast to traditional image analysis approaches, such experiments are nowadays preferably analyzed by deep learning based approaches, which, however, lack crucial information about the channel importance. To overcome this limitation, we present a novel approach which utilizes the multi-spectral information of high content images to interpret a certain aspect of cellular biology. To this end, we base our method on image blending concepts with alpha compositing for an arbitrary number of channels. More specifically, we introduce DCMIX, a lightweight, scalable and end-to-end trainable mixing layer which enables interpretable predictions in high content imaging while retaining the benefits of deep learning based methods. We conduct an extensive set of experiments on both the MNIST and RXRX1 datasets, demonstrating that DCMIX learns the biologically relevant channel importance without sacrificing prediction performance. Comment: Accepted at the DAGM German Conference on Pattern Recognition (GCPR) 2023.
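    The alpha-compositing idea can be pictured as a single trainable mixing layer that blends the image channels with softmax-normalised weights before a standard backbone; the trained weights are then read off as channel importance. The sketch below is a simplified stand-in inspired by that description, not the authors' DCMIX implementation.

```python
# Hypothetical sketch of a learnable channel-mixing layer in the spirit of the
# alpha-compositing idea described above; not the authors' DCMIX code.
import torch
import torch.nn as nn

class ChannelMixing(nn.Module):
    """Blends C input channels into one image with softmax-normalised mixing
    weights; the trained weights serve as a channel-importance estimate."""
    def __init__(self, num_channels):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x):                               # x: (batch, C, H, W)
        alpha = torch.softmax(self.logits, dim=0)       # one weight per channel
        return (alpha.view(1, -1, 1, 1) * x).sum(dim=1, keepdim=True)

    def channel_importance(self):
        return torch.softmax(self.logits, dim=0).detach()

# Usage: prepend the mixing layer to any single-channel backbone, e.g.
# model = nn.Sequential(ChannelMixing(num_channels=6),
#                       nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), ...)
```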

    Learning Sparse Latent Representations with the Deep Copula Information Bottleneck

    Deep latent variable models are powerful tools for representation learning. In this paper, we adopt the deep information bottleneck model, identify its shortcomings and propose a model that circumvents them. To this end, we apply a copula transformation which, by restoring the invariance properties of the information bottleneck method, leads to disentanglement of the features in the latent space. Building on that, we show how this transformation translates to sparsity of the latent space in the new model. We evaluate our method on artificial and real data. Comment: Published as a conference paper at ICLR 2018. Aleksander Wieczorek and Mario Wieser contributed equally to this work.
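    The copula transformation that restores invariance against monotone transformations can be approximated on data by mapping each variable to normal scores (empirical CDF followed by the Gaussian quantile function). The snippet below is a generic sketch of that preprocessing step, not the paper's exact code.

```python
# Hypothetical sketch of a marginal (Gaussian) copula transformation: a generic
# normal-scores transform, not the paper's exact preprocessing pipeline.
import numpy as np
from scipy.stats import norm, rankdata

def copula_transform(X):
    """Map each column of X to normal scores: rank -> uniform -> Gaussian.
    Any strictly increasing transformation of a column leaves its ranks, and
    therefore the transformed values, unchanged."""
    n = X.shape[0]
    U = rankdata(X, axis=0) / (n + 1)   # empirical CDF values in (0, 1)
    return norm.ppf(U)                  # Gaussian marginals, same copula

# Example: a column and its exponential yield identical transformed values.
# X = np.random.randn(100, 1)
# np.allclose(copula_transform(X), copula_transform(np.exp(X)))  # True
```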

    Learning Extremal Representations with Deep Archetypal Analysis

    Archetypes are typical population representatives in an extremal sense, where typicality is understood as the most extreme manifestation of a trait or feature. In linear feature space, archetypes approximate the data convex hull, allowing all data points to be expressed as convex mixtures of archetypes. However, it might not always be possible to identify meaningful archetypes in a given feature space. Learning an appropriate feature space and identifying suitable archetypes simultaneously addresses this problem. This paper introduces a generative formulation of the linear archetype model, parameterized by neural networks. By introducing the distance-dependent archetype loss, the linear archetype model can be integrated into the latent space of a variational autoencoder, and an optimal representation with respect to the unknown archetypes can be learned end-to-end. The reformulation of linear archetypal analysis as a deep variational information bottleneck allows the incorporation of arbitrarily complex side information during training. Furthermore, an alternative prior, based on a modified Dirichlet distribution, is proposed. The real-world applicability of the proposed method is demonstrated by exploring archetypes of female facial expressions while using multi-rater based emotion scores of these expressions as side information. A second application illustrates the exploration of the chemical space of small organic molecules. In this experiment, it is demonstrated that exchanging the side information while keeping the same set of molecules, e.g. using the heat capacity of each molecule as side information instead of the band gap energy, will result in the identification of different archetypes. As an application, these learned representations of chemical space might reveal distinct starting points for de novo molecular design. Comment: Under review for publication at the International Journal of Computer Vision (IJCV). Extended version of our GCPR 2019 paper "Deep Archetypal Analysis".
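    The core construction, expressing each latent code as a convex mixture of learnable archetypes inside an autoencoder latent space, can be sketched as follows. The class and loss below are simplified illustrations with hypothetical names; the paper's actual distance-dependent archetype loss may differ.

```python
# Hypothetical sketch: latent codes as convex mixtures of learnable archetypes,
# a simplified stand-in for deep archetypal analysis; names are illustrative.
import torch
import torch.nn as nn

class ArchetypeLatent(nn.Module):
    """Encoder logits become convex weights over K archetypes, so every latent
    code lies inside the simplex spanned by the archetype matrix."""
    def __init__(self, num_archetypes, latent_dim):
        super().__init__()
        self.archetypes = nn.Parameter(torch.randn(num_archetypes, latent_dim))

    def forward(self, logits):                      # logits: (batch, K)
        weights = torch.softmax(logits, dim=-1)     # convex mixture coefficients
        z = weights @ self.archetypes               # point inside the archetype hull
        return z, weights

def archetype_loss(z_batch, archetypes):
    """Simple distance-style penalty: each archetype should stay close to the
    encoded batch, keeping the archetypes data-supported (a crude proxy for the
    paper's distance-dependent archetype loss)."""
    d = torch.cdist(archetypes, z_batch)            # (K, batch) pairwise distances
    return d.min(dim=1).values.mean()
```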

    Burden of micronutrient deficiencies by socio-economic strata in children aged 6 months to 5 years in the Philippines

    Background: Micronutrient deficiencies (MNDs) are a chronic lack of vitamins and minerals and constitute a major public health problem. MNDs have severe health consequences and are particularly harmful during early childhood due to their impact on physical and cognitive development. We estimate the costs of illness due to iron deficiency anaemia (IDA), vitamin A deficiency (VAD) and zinc deficiency (ZnD) in 2 age groups (6-23 and 24-59 months) of Filipino children by socio-economic strata in 2008. Methods: We build a health economic model simulating the consequences of MNDs in childhood over the entire lifetime. The model is based on a health survey and a nutrition survey carried out in 2008. The sample populations are first structured into 10 socio-economic strata (SES) and 2 age groups. Health consequences of MNDs are modelled based on information extracted from the literature. Direct medical costs, production losses and intangible costs are computed, and long-term costs are discounted to present value. Results: Total lifetime costs of IDA, VAD and ZnD amounted to direct medical costs of 30 million dollars, production losses of 618 million dollars and intangible costs of 122,138 disability-adjusted life years (DALYs). These costs can be interpreted as the lifetime costs of a 1-year cohort affected by MNDs between the ages of 6 and 59 months. Direct medical costs are dominated by costs due to ZnD (89% of total), production losses by losses in future lifetime (90% of total), and intangible costs by premature death (47% of total DALY losses) and losses in future lifetime (43%). Costs of MNDs differ considerably between SES, as costs in the poorest third of the households are 5 times higher than in the wealthiest third. Conclusions: MNDs lead to substantial costs in 6-59-month-old children in the Philippines. Costs are highly concentrated in the lower SES and in children 6-23 months old. These results may have important implications for the design, evaluation and choice of the most effective and cost-effective policies aimed at the reduction of MNDs.
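    The discounting of long-term costs to present value mentioned in the Methods follows the standard formula PV = sum over t of c_t / (1 + r)^t. The sketch below shows that step with purely illustrative numbers (a 3% rate and a made-up loss stream), not figures from the study.

```python
# Hypothetical sketch of present-value discounting of future costs; the 3% rate
# and the loss stream are illustrative, not values taken from the study.
def present_value(annual_costs, discount_rate=0.03):
    """Discount a stream of future annual costs to present value:
    PV = sum_t cost_t / (1 + r)**t, with t = 0 for the reference year."""
    return sum(c / (1.0 + discount_rate) ** t for t, c in enumerate(annual_costs))

# Example: a productivity loss of 100 per year over 40 working years,
# starting 15 years in the future.
losses = [0.0] * 15 + [100.0] * 40
print(round(present_value(losses), 1))
```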